Most of the Victorian population is concentrated in the Melbourne city region. The other regions, though geographically large, have smaller populations.
| SA4_CODE_2016 | femalepopulation | malepopulation | population |
|---|---|---|---|
| 201 | 32726 | 34691 | 67417 |
| 202 | 32396 | 34054 | 66450 |
| 203 | 60660 | 64307 | 124967 |
| 204 | 35934 | 39614 | 75548 |
| 205 | 52929 | 57572 | 110501 |
| 206 | 159362 | 160819 | 320181 |
| 207 | 81814 | 86786 | 168600 |
| 208 | 96482 | 101671 | 198153 |
| 209 | 109370 | 122195 | 231565 |
| 210 | 71224 | 85167 | 156391 |
| 211 | 118179 | 129501 | 247680 |
| 212 | 151481 | 184164 | 335645 |
| 213 | 147830 | 178340 | 326170 |
| 214 | 62731 | 68190 | 130921 |
| 215 | 29867 | 33492 | 63359 |
| 216 | 25915 | 28796 | 54711 |
| 217 | 26236 | 29297 | 55533 |
| 297 | 0 | 9 | 9 |
| 299 | 765 | 1229 | 1994 |
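The columns of the table can be cross-checked with a few lines of base R. The sketch below copies three rows from the table above (SA4 201, 206 and 212), recomputes the total, and adds a male-to-female ratio. The column name `male_to_female` is introduced here for illustration only and is not part of the dashboard.

```r
# Three rows copied from the population table (SA4 201, 206, 212).
pop <- data.frame(
  SA4_CODE_2016    = c("201", "206", "212"),
  femalepopulation = c(32726, 159362, 151481),
  malepopulation   = c(34691, 160819, 184164)
)

# The total should reproduce the table's `population` column.
pop$population <- pop$femalepopulation + pop$malepopulation

# Illustrative extra column: values above 1 mean more males than females.
pop$male_to_female <- pop$malepopulation / pop$femalepopulation

pop
```

In these three rows the ratio is above one, so the counts include more males than females in each of those regions.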
The largest group of workers is health care professionals, and the ratio of men to women in this group is less than one.
In construction, more men than women are employed as labourers.
The number of women in the education sector far exceeds the number of men.
Management and Commerce is the most commonly studied field.
More men than women have studied Engineering and Technology. However, more people are employed in health care than in engineering-related industries.
More women than men have studied Management and Commerce, yet more men are employed as managers.
Most of the Victorian population is educated up to Level 7, and most are employed as professionals.
However, many people are employed as labourers even though the share of people who studied below high-school level is very small.
GenderLinearModel shows the relationship between the male and female populations.
Most residents attained Level 7, which corresponds to a bachelor degree, and almost twice as many females as males did so.
The majority of male residents attained Levels 3 and 4.
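The level labels used here come from recoding the qualification codes in the census column names. A minimal sketch of that recoding as a named-vector lookup, using the same code-to-label pairs as the dashboard's `case_when()` step; the helper `to_aqf_level()` is a hypothetical name introduced for this example.

```r
# Qualification codes from the census column names, mapped to the level
# labels used in this dashboard.
aqf_lookup <- c(
  Cert_I_II            = "Level 1 & 2",
  Cert_III_IV          = "Level 3 & 4",
  AdvDip_and_Dip       = "Level 5 & 6",
  BachDeg              = "Level 7",
  GradDip_and_GradCert = "Level 8",
  PGrad_Deg            = "Level 9",
  Lev_Edu_NS           = "Not Stated"
)

# Hypothetical helper: return the label, or the code itself when unmapped.
to_aqf_level <- function(code) {
  label <- unname(aqf_lookup[code])
  ifelse(is.na(label), code, label)
}

to_aqf_level(c("BachDeg", "Cert_III_IV", "Unknown_Code"))
```

Falling back to the original code mirrors the `TRUE ~ educationlevel` branch, so an unexpected code stays visible instead of turning into `NA`.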
| afq_level | age_min | population |
|---|---|---|
| Level 1 & 2 | 15 | 9402 |
| Level 3 & 4 | 25 | 146297 |
| Level 5 & 6 | 25 | 96920 |
| Level 7 | 25 | 245613 |
| Level 9 | 25 | 83204 |
| Not Stated | 25 | 70455 |
| Level 8 | 35 | 28908 |
| industry | age_min | population |
|---|---|---|
| Accommodation_and_food_services | 25 | 42103 |
| Administrative_and_support_services | 25 | 23086 |
| Arts_and_recreation_services | 25 | 13149 |
| Construction | 25 | 61959 |
| Electricity_gas_water_and_waste_service | 25 | 8039 |
| Financial_and_insurance_services | 25 | 32021 |
| Health_care_and_social_assistance | 25 | 80994 |
| Information_media_and_telecommunications | 25 | 14702 |
| Not Stated | 25 | 29901 |
| Other_services | 25 | 24089 |
| Professional_scientific_and_technical_services | 25 | 64125 |
| Rental_hiring_and_real_estate_services | 25 | 11796 |
| Retail_trade | 25 | 61803 |
| Mining | 35 | 2441 |
| Wholesale_trade | 35 | 22199 |
| Education_and_training | 45 | 56125 |
| Manufacturing | 45 | 55206 |
| Public_administration_and_safety | 45 | 37747 |
| Transport_postal_and_warehousing | 45 | 32663 |
| Agriculture_forestry_and_fishing | 55 | 12733 |
| field | age_min | population |
|---|---|---|
| Mixed_Field_Programmes | 15 | 1813 |
| Architecture_and_Building | 25 | 42510 |
| Creative_Arts | 25 | 40334 |
| Food_Hospitality_and_Personal_Services | 25 | 42938 |
| Health | 25 | 67630 |
| Information_Technology | 25 | 37535 |
| Management_and_Commerce | 25 | 150571 |
| Natural_and_Physical_Sciences | 25 | 22171 |
| Not Stated | 25 | 71440 |
| Society_and_Culture | 25 | 80932 |
| Agriculture_Environment | 35 | 13016 |
| Engineering_and_Technologies | 45 | 77524 |
| Education | 55 | 44696 |
| NA | NA | 896 |
| occupation | age_min | population |
|---|---|---|
| Community_and_personal_service_workers | 25 | 67104 |
| Not Stated | 25 | 11075 |
| Professionals | 25 | 190449 |
| Sales_workers | 25 | 51772 |
| Technicians_and_trades_workers | 25 | 99110 |
| Managers | 35 | 100601 |
| Clerical_and_administrative_workers | 45 | 89021 |
| Labourers | 45 | 49653 |
| Machinery_operators_and_drivers | 45 | 40922 |
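Each of the four tables above keeps, for every category, the age band with the largest count. The sketch below shows the same idea in base R on hypothetical counts; the dashboard itself uses `group_by()` with `slice_max()`, and the object names here are illustrative.

```r
# Hypothetical counts: one row per occupation and age band.
toy <- data.frame(
  occupation = c("Managers", "Managers", "Labourers", "Labourers"),
  age_min    = c(25, 35, 25, 45),
  population = c(900, 1200, 400, 700)
)

# Split by occupation and keep the row with the largest count in each group.
peak_by_group <- do.call(rbind, lapply(split(toy, toy$occupation), function(g) {
  g[which.max(g$population), ]
}))

# Sort by age band, as the tables do.
peak_by_group <- peak_by_group[order(peak_by_group$age_min), ]
rownames(peak_by_group) <- NULL

peak_by_group
```

Note that `which.max()` keeps only the first row when two age bands tie, whereas `slice_max()` keeps all tied rows by default.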
Best education level of each region
Best field of each region
Spatial Education Level Distribution
Spatial Industry Distribution
Spatial Study Field Distribution
Spatial Occupation Distribution
---
title: "ETC5513 Assignment4 -Team StarWars"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    navbar:
      - { title: "About", href: "https://github.com/mohammedfaizan0014/etc5513-assignment-4-star-wars/blob/main/README.md", align: left }
    social: [ "twitter", "facebook", "menu" ]
    source_code: embed
---
```{r echo=FALSE, include=FALSE}
knitr::opts_chunk$set(fig.path = "Figures/", fig.align = "center",
out.width = "50%", echo = FALSE,
message = FALSE,
warning = FALSE)
# Loading Libraries
library(tidyverse)
library(readr)
library(kableExtra)
library(tinytex)
library(bookdown)
library(naniar)
library(visdat)
library(citation)
library(knitr)
library(scales)
library(patchwork)
library(sf)
library(glue)
library(unglue)
library(sugarbag)
library(readxl)
library(plotly)
library(tidytext)
library(ggplot2)
```
```{r}
data_path <- here::here("data/australian_census_data_2016/")
census_paths <- glue::glue(data_path, "/2016 Census GCP All Geographies for VIC/SA4/VIC/2016Census_G{number}{alpha}_VIC_SA4.csv",
number = c("46","46","47","47","47","51","51","51","51","57","57", "52", "52", "52", "52", "58", "58"), alpha = c("A","B","A","B","C","A","B","C","D","A","B", "A","B","C","D", "A","B"))
```
```{r geopath}
geopath <- glue::glue(data_path, "/2016_SA4_shape/SA4_2016_AUST.shp")
sa4_codes<- read_csv(census_paths[2]) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
select(SA4_CODE_2016)
sa4_geomap <- read_sf(geopath) %>%
right_join(sa4_codes, by=c("SA4_CODE16" = "SA4_CODE_2016"))
```
```{r g46read}
g46a<- read_csv(census_paths[1]) %>%
select(-starts_with("P"), -contains("Tot"), -contains("nfd"), -contains("IDes")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{educationlevel=GradDip_and_GradCert}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=PGrad_Deg}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=BachDeg}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=AdvDip_and_Dip}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Cert_III_IV}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Cert_I_II}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Lev_Edu_NS}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Lev_Edu_NS|GradDip_and_GradCert|PGrad_Deg|BachDeg|AdvDip_and_Dip|Cert_III_IV|Cert_I_II}_{age_min=\\d+}ov"
),
remove = FALSE) %>%
select(-category)
```
```{r}
g46a <- g46a %>%
mutate(afq_level =case_when(str_detect(educationlevel, "GradDip_and_GradCert") ~ "Level 8",
str_detect(educationlevel, "PGrad") ~ "Level 9",
str_detect(educationlevel, "BachDeg") ~ "Level 7",
str_detect(educationlevel, "AdvDip_and_Dip") ~ "Level 5 & 6",
str_detect(educationlevel, "Cert_III_IV") ~ "Level 3 & 4",
str_detect(educationlevel, "Cert_I_II") ~ "Level 1 & 2",
str_detect(educationlevel, "Cert_Levl_nfd") ~ "Level 3 & 4",
str_detect(educationlevel, "Lev_Edu_IDes") ~ "Level Inadequately Described",
str_detect(educationlevel, "Lev_Edu_NS") ~ "Not Stated",
TRUE ~ educationlevel)) %>%
rename(count_edu_lvl = count)
```
```{r}
g47 <- map_dfr(census_paths[3:4], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot"), -contains("InadDes")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}ov",
"{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}_years_and_over"
),
remove = FALSE)
})
```
```{r}
g47 <- g47 %>%
mutate(field =case_when(
str_detect(field, "NatPhyl_Scn") ~ "Natural_and_Physical_Sciences",
str_detect(field, "InfoTech") ~ "Information_Technology",
str_detect(field, "Eng_RelTec") ~ "Engineering_and_Technologies",
str_detect(field, "ArchtBldng") ~ "Architecture_and_Building",
str_detect(field, "Ag_Envir_Rltd_Sts") ~ "Agriculture_Environment",
str_detect(field, "Health") ~ "Health",
str_detect(field, "Educ") ~ "Education",
str_detect(field, "Mgnt_Com") ~ "Management_and_Commerce",
str_detect(field, "Society_Cult") ~ "Society_and_Culture",
str_detect(field, "Creative_Arts") ~ "Creative_Arts",
str_detect(field, "Fd_Hosp_Psnl_Svcs") ~ "Food_Hospitality_and_Personal_Services",str_detect(field, "MixFld_Prgm") ~ "Mixed_Field_Programmes",
str_detect(field, "FldStd_NS") ~ "Not Stated",
TRUE ~ field)) %>%
select(-category) %>%
rename(count_field = count)
```
```{r}
g51 <- map_dfr(census_paths[6:8], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{industry=(Ag_For_Fshg|Mining|Manufact|El_Gas_Wt_Waste|Constru|WhlesaleTde|RetTde|Accom_food|Trans_post_wrehsg|Info_media_teleco|Fin_Insur|RtnHir_REst|Pro_scien_tec|Admin_supp|Public_admin_sfty|Educ_trng|HlthCare_SocAs|Art_recn|Oth_scs|ID_NS)}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{industry=(Ag_For_Fshg|Mining|Manufact|El_Gas_Wt_Waste|Constru|WhlesaleTde|RetTde|Accom_food|Trans_post_wrehsg|Info_media_teleco|Fin_Insur|RtnHir_REst|Pro_scien_tec|Admin_supp|Public_admin_sfty|Educ_trng|HlthCare_SocAs|Art_recn|Oth_scs|ID_NS)}_{age_min=\\d+}ov"
),
remove = FALSE)
})
```
```{r}
g51 <- g51 %>%
mutate(industry =case_when(
str_detect(industry, "Ag_For_Fshg") ~ "Agriculture_forestry_and_fishing",
str_detect(industry, "Manufact") ~ "Manufacturing",
str_detect(industry, "El_Gas_Wt_Waste") ~ "Electricity_gas_water_and_waste_service",
str_detect(industry, "Constru") ~ "Construction",
str_detect(industry, "Ag_Envir_Rltd_Sts") ~ "Agriculture_Environment",
str_detect(industry, "WhlesaleTde") ~ "Wholesale_trade",
str_detect(industry, "RetTde") ~ "Retail_trade",
str_detect(industry, "Accom_food") ~ "Accommodation_and_food_services",
str_detect(industry, "Trans_post_wrehsg") ~ "Transport_postal_and_warehousing",
str_detect(industry, "Info_media_teleco") ~ "Information_media_and_telecommunications",
str_detect(industry, "Fin_Insur") ~ "Financial_and_insurance_services",
str_detect(industry, "RtnHir_REst") ~ "Rental_hiring_and_real_estate_services",
str_detect(industry, "Pro_scien_tec") ~ "Professional_scientific_and_technical_services",
str_detect(industry, "Admin_supp") ~ "Administrative_and_support_services",
str_detect(industry, "Public_admin_sfty") ~ "Public_administration_and_safety",
str_detect(industry, "Educ_trng") ~ "Education_and_training",
str_detect(industry, "HlthCare_SocAs") ~ "Health_care_and_social_assistance",
str_detect(industry, "Art_recn") ~ "Arts_and_recreation_services",
str_detect(industry, "Oth_scs") ~ "Other_services",
str_detect(industry, "ID_NS") ~ "Not Stated",
TRUE ~ industry)) %>%
select(-category) %>%
rename(count_industry = count)
```
```{r}
g57 <- map_dfr(census_paths[10], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}{age_min=\\d+}_{age_max=\\d+}_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}",
"{sex=[MF]}{age_min=\\d+}ov_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}",
"{sex=[MF]}{age_min=\\d+}_ov_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}"
),
remove = FALSE)
})
```
```{r}
g57 <- g57 %>%
mutate(occupation =case_when(
str_detect(occupation, "TechnicTrades_W") ~ "Technicians_and_trades_workers",
str_detect(occupation, "TechnicTrades_Wrs") ~ "Technicians_and_trades_workers",
str_detect(occupation, "CommunPersnlSvc") ~ "Community_and_personal_service_workers",
str_detect(occupation, "ClericalAdminis_W") ~ "Clerical_and_administrative_workers",
str_detect(occupation, "Sales_W") ~ "Sales_workers",
str_detect(occupation, "Mach_oper_drivers") ~ "Machinery_operators_and_drivers",
str_detect(occupation, "Occu_ID_NS") ~ "Not Stated",
TRUE ~ occupation)) %>%
select(-category) %>%
rename(count_occupation = count)
```
```{r}
g52 <- map_dfr(census_paths[12:14], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}_{hr_max=\\d+}",
"{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}",
"{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}over"
),
remove = FALSE)
})
```
```{r}
g52 <- g52 %>%
mutate(industry =case_when(
str_detect(industry, "AgriForestFish") ~ "Agriculture_forestry_and_fishing",
str_detect(industry, "Min") ~ "Mining",
str_detect(industry, "Mnfg") ~ "Manufacturing",
str_detect(industry, "EGW_WS") ~ "Electricity_gas_water_and_waste_service",
str_detect(industry, "Cnstn") ~ "Construction",
str_detect(industry, "WTrade") ~ "Wholesale_trade",
str_detect(industry, "RTrade") ~ "Retail_trade",
str_detect(industry, "AccomFoodS") ~ "Accommodation_and_food_services",
str_detect(industry, "TransPostWhse") ~ "Transport_postal_and_warehousing",
str_detect(industry, "InfoMedTelecom") ~ "Information_media_and_telecommunications",
str_detect(industry, "FinInsurS") ~ "Financial_and_insurance_services",
str_detect(industry, "RentHirREserv") ~ "Rental_hiring_and_real_estate_services",
str_detect(industry, "ProScieTechServ") ~ "Professional_scientific_and_technical_services",
str_detect(industry, "AdminSupServ") ~ "Administrative_and_support_services",
str_detect(industry, "PubAdmiSafety") ~ "Public_administration_and_safety",
str_detect(industry, "EducTrain") ~ "Education_and_training",
str_detect(industry, "HealthCareSocA") ~ "Health_care_and_social_assistance",
str_detect(industry, "ArtRecServ") ~ "Arts_and_recreation_services",
str_detect(industry, "OthServ") ~ "Other_services",
str_detect(industry, "ID_NS") ~ "Not Stated",
TRUE ~ industry)) %>%
select(-category) %>%
rename(count_industry = count)
```
```{r}
g58 <- map_dfr(census_paths[16], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}_{hrs_max=\\d+}",
"{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}",
"{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}over"
),
remove = FALSE)
})
```
```{r}
g58 <- g58 %>%
mutate(occupation =case_when(
str_detect(occupation, "Mng") ~ "Managers",
str_detect(occupation, "Pro") ~ "Professionals",
str_detect(occupation, "TTW") ~ "Technicians_and_trades_workers",
str_detect(occupation, "CPS") ~ "Community_and_personal_service_workers",
str_detect(occupation, "CA") ~ "Clerical_and_administrative_workers",
str_detect(occupation, "Sal") ~ "Sales_workers",
str_detect(occupation, "MOD") ~ "Machinery_operators_and_drivers",
str_detect(occupation, "Lab") ~ "Labourers",
str_detect(occupation, "ID_NS") ~ "Not Stated",
TRUE ~ occupation)) %>%
select(-category) %>%
rename(count_occupation = count)
```
Population Count {.storyboard}
=========================================
### Population Map
```{r}
vicpopulation <- g51 %>%
group_by(SA4_CODE_2016) %>%
summarise(population = sum(count_industry)) %>%
ungroup()
population <- vicpopulation %>%
summarise(population=sum(population))
vicpopulation <- g51 %>%
group_by(SA4_CODE_2016, sex) %>%
summarise(population = sum(count_industry)) %>%
ungroup(sex) %>%
pivot_wider(names_from = sex,
values_from = population) %>%
rename(malepopulation = M,
femalepopulation = `F`) %>%
full_join(vicpopulation)
```
```{r}
vicpopulation %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016" = "SA4_CODE16")) %>%
ggplot() +
geom_sf(mapping = aes(geometry = geometry, fill = population)) +
geom_sf_text(aes(geometry = geometry, label = SA4_CODE_2016),
colour = "white", check_overlap = TRUE) +
theme_void()
```
> Most of the Victorian population is concentrated in the Melbourne city region.
> The other regions, though geographically large, have smaller populations.
### Population Table
```{r}
vicpopulation %>%
kable(caption = "Victorian Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
### Age Distribution
```{r agedistributiong57, fig.height=4}
g57redundantage <- g57[rep(rownames(g57), g57$count_occupation), ]
g57redundantage %>%
ggplot()+
geom_density(mapping = aes(x = as.numeric(age_min),
colour = sex),
alpha = 0.5) +
labs(x="age") +
scale_x_continuous()
```
***
- Most of the population is middle-aged, between 20 and 50 years.
- Older people form a small, vulnerable share of the population.
- The age distribution is similar for males and females.
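The density chunk on this page expands `g57` to one row per person before plotting. A minimal sketch of an alternative, on hypothetical counts: base R's `density()` accepts `weights`, so the same curve can be estimated from one row per age band. The bandwidth `bw = 5` is an arbitrary choice for the example, and the counts are made up.

```r
# Hypothetical counts per age band (lower bound of each band).
age_min <- c(15, 25, 35, 45, 55, 65)
count   <- c(40, 120, 100, 90, 50, 10)

# Approach used in the dashboard: one value per person.
expanded <- density(rep(age_min, count), bw = 5)

# Weighted alternative: six values, with weights that sum to one.
weighted <- density(age_min, weights = count / sum(count), bw = 5)

# Compare the two estimated curves.
all.equal(expanded$y, weighted$y)
```

In ggplot2 the corresponding option would be the `weight` aesthetic of `geom_density()`, which avoids building the expanded data frame when counts are large.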
GenderLinearModel {.hidden}
=====================================
Column
-----------------------------------------
### Occupation: Male vs Female
```{r}
g57 %>%
pivot_wider(names_from = sex,
values_from = count_occupation) %>%
ggplot(mapping = aes(x = M, y = `F`, colour = occupation)) +
geom_point() +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population") +
scale_y_continuous(label=label_number()) +
scale_x_continuous(label=label_number()) +
theme(legend.position = "bottom")
```
### Occupation: Male vs Female
```{r}
g57occmfnest <- g57 %>%
pivot_wider(names_from = sex,
values_from = count_occupation) %>%
select(occupation, `F`, M ) %>%
group_by(occupation) %>%
nest() %>%
mutate(model = map(data, lm)) %>%
mutate(aug = map(model, broom::augment)) %>%
unnest(aug)
mvfocc <- ggplot(g57occmfnest,
aes(x = M)) +
# index represent splitting value,
#geom_point(aes(y = `F`, colour = industry, aplha=0)) +
geom_line(aes(y = .fitted, colour = occupation)) +
theme(legend.position = "bottom") +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population")
# geom_text(aes(y = .fitted,label=industry, colour="white"),
# check_overlap=TRUE)
ggplotly(mvfocc) %>%
hide_legend()
```
Column
-----------------------------------------
### Industry: Male vs Female
```{r}
g51 %>%
pivot_wider(names_from = sex,
values_from = count_industry) %>%
ggplot(mapping = aes(x = M, y = `F`, colour = industry)) +
geom_point() +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population") +
scale_y_continuous(label=label_number()) +
scale_x_continuous(label=label_number()) +
theme(legend.position = "bottom")
```
### Industry: Male vs Female
```{r}
g51indmfnest <- g51 %>%
pivot_wider(names_from = sex,
values_from = count_industry) %>%
select(industry, `F`, M ) %>%
group_by(industry) %>%
nest() %>%
mutate(model = map(data, lm)) %>%
mutate(aug = map(model, broom::augment)) %>%
unnest(aug)
mvf <- ggplot(g51indmfnest,
aes(x = M)) +
# index represent splitting value,
#geom_point(aes(y = `F`, colour = industry, aplha=0)) +
geom_line(aes(y = .fitted, colour = industry)) +
theme(legend.position = "bottom") +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population")
# geom_text(aes(y = .fitted,label=industry, colour="white"),
# check_overlap=TRUE)
ggplotly(mvf) %>%
hide_legend()
```
Population: Gender {data-navmenu="Analysis"}
=========================================
Row
-----------------------------------------
- The largest group of workers is health care professionals, and the ratio of men to women in this group is less than one.
- In construction, more men than women are employed as labourers.
- The number of women in the education sector far exceeds the number of men.
- Management and Commerce is the most commonly studied field.
- More men than women have studied Engineering and Technology. However, more people are employed in health care than in engineering-related industries.
- More women than men have studied Management and Commerce, yet more men are employed as managers.
- Most of the Victorian population is educated up to Level 7, and most are employed as professionals.
- However, many people are employed as labourers even though the share of people who studied below high-school level is very small.
- [GenderLinearModel] shows the relationship between the male and female populations.
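The linked page fits one linear model per occupation and per industry, with the female count as the response and the male count as the predictor. A minimal sketch of a single such fit on hypothetical counts; the four data points below are invented for illustration and do not come from the census.

```r
# Hypothetical male and female counts for one occupation in four regions.
toy <- data.frame(
  M   = c(100, 200, 300, 400),
  `F` = c(160, 240, 360, 440),
  check.names = FALSE
)

# Regress the female count on the male count.
fit <- lm(`F` ~ M, data = toy)

coef(fit)    # intercept and slope
fitted(fit)  # values that would be drawn as the fitted line
```

A slope below one would mean the female count grows more slowly than the male count across regions, and a slope above one the reverse.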
Column {.tabset}
-----------------------------------------
### Population by Education
- Most residents attained Level 7, which corresponds to a bachelor degree, and almost twice as many females as males did so.
- The majority of male residents attained Levels 3 and 4.
```{r}
g46a %>%
ggplot(mapping = aes(x = fct_reorder(afq_level,count_edu_lvl), y = count_edu_lvl, fill = sex)) +
geom_col(mapping = aes(x = reorder_within(afq_level,count_edu_lvl, sex), y = count_edu_lvl, fill = sex)) +
labs(title = "Population Share of education level", y = "Number of students") +
scale_y_continuous(label=label_number()) +
theme(axis.title.y = element_blank())+
coord_flip()
```
### Population by Industry
```{r}
g51 %>%
ggplot(mapping = aes(x = fct_reorder(industry,count_industry), y = count_industry, fill = sex)) +
geom_col(mapping = aes(x = reorder_within(industry,count_industry, sex), y = count_industry, fill = sex)) +
labs(title = "Population Share of Industries", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
theme(axis.title.y = element_blank())+
coord_flip()
```
Column {.tabset}
-----------------------------------------
### Population by Field
```{r}
ggplot(g47, aes(x = reorder_within(field,count_field,sex),
y = count_field,
fill = sex)) +
geom_col() +
labs(x = "Field",
y = "number of observations",
title = "Field by gender") +
scale_y_continuous(label=label_number()) +
theme(axis.title.y = element_blank())+
coord_flip()
```
### Population by Occupation
```{r}
g57 %>%
ggplot(mapping = aes(x = fct_reorder(occupation,count_occupation), y = count_occupation, fill = sex)) +
geom_col(mapping = aes(x = reorder_within(occupation,count_occupation, sex), y = count_occupation, fill = sex)) +
labs(title = "Population Share of Occupation", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
theme(axis.title.y = element_blank())+
coord_flip()
```
Population: Age {data-orientation=rows data-navmenu="Analysis"}
=========================================
Row {data-height=200}
-----------------------------------------
- As the age distribution shows, all sectors have people in the 25 to 45 age group.
- The 25-35 age group has the largest population in most sectors.
- A key observation is that some people aged over 75 are still working.
Row {.tabset}
-----------------------------------------
### Population by Education
```{r}
g46redundantage <- g46a[rep(rownames(g46a), g46a$count_edu_lvl), ]
ageeducount <- g46redundantage %>%
ggplot(mapping = aes(x = age_min, y = afq_level)) +
geom_count() +
labs(title = "Population: Education level and Age", x = "Age") +
theme(axis.title.y = element_blank())
ggplotly(ageeducount)
```
### Population by Industries
```{r}
g51redundantage <- g51[rep(rownames(g51), g51$count_industry), ]
ageindcount <- g51redundantage %>%
ggplot(mapping = aes(x = age_min, y = industry)) +
geom_count() +
labs(title = "Population: Industries and Age", x = "Age") +
theme(axis.title.y = element_blank())
ggplotly(ageindcount)
```
Row {.tabset}
-----------------------------------------
### Population by Field
```{r}
g47redundantage <- g47[rep(rownames(g47), g47$count_field), ]
agefieldcount <- g47redundantage %>%
ggplot(mapping = aes(x = age_min, y = field)) +
geom_count() +
labs(title = "Population: Field and Age", x = "Age") +
theme(axis.title.y = element_blank())
ggplotly(agefieldcount)
```
### Population by Occupation
```{r}
g57redundantage <- g57[rep(rownames(g57), g57$count_occupation), ]
ageocccount <- g57redundantage %>%
ggplot(mapping = aes(x = age_min, y = occupation)) +
geom_count() +
labs(title = "Population: Occupation and Age", x = "Age") +
theme(axis.title.y = element_blank())
ggplotly(ageocccount)
```
Population: Peak Age {data-orientation=rows data-navmenu="Analysis"}
=========================================
Row {.tabset}
-----------------------------------------
### Population by Education, Age
```{r}
g46a %>%
group_by(afq_level, age_min) %>%
summarise(population=sum(count_edu_lvl)) %>%
group_by(afq_level) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Education: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
### Population by Industries, Age
```{r}
g51 %>%
group_by(industry, age_min) %>%
summarise(population=sum(count_industry)) %>%
group_by(industry) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Industry: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
Row {.tabset}
-----------------------------------------
### Population by Field
```{r}
g47 %>%
group_by(field, age_min) %>%
summarise(population=sum(count_field)) %>%
group_by(field) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Field: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
### Population by Occupation
```{r}
g57%>%
group_by(occupation, age_min) %>%
summarise(population=sum(count_occupation)) %>%
group_by(occupation) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Occupation: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
Region : Sectors {data-orientation=rows data-navmenu="Regions"}
=========================================
Column
-----------------------------------------
### Education Level: Region
```{r bestedu, fig.cap="Best education level of each region"}
popareaedu <- g46a %>%
group_by(SA4_CODE_2016, afq_level) %>%
summarise(count_eduarea = sum(count_edu_lvl)) %>%
ungroup()
bestedu <- popareaedu %>%
select(1:3) %>%
group_by(afq_level) %>%
slice_max(count_eduarea) %>%
arrange(SA4_CODE_2016)
bestedu %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(afq_level,count_eduarea, SA4_CODE_2016), y = count_eduarea, fill = afq_level)) +
labs(title = "Region and Best education level",
x = "Education level with region code",
y = "Number of Students") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
### Industry: Region
```{r popareaind, fig.cap=""}
popindarea <- g51 %>%
group_by(SA4_CODE_2016, industry) %>%
summarise(count_indarea = sum(count_industry)) %>%
ungroup()
popareaindmax <- popindarea %>%
select(1:3) %>%
group_by(industry) %>%
slice_max(count_indarea) %>%
arrange(SA4_CODE_2016)
popareaindmax %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(industry,count_indarea, SA4_CODE_2016), y = count_indarea, fill = industry)) +
labs(title = "Region and Best Industry", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
Column
-----------------------------------------
### Field: Region
```{r bestfield, fig.cap="Best field of each region"}
popareafield <- g47 %>%
group_by(SA4_CODE_2016, field) %>%
summarise(count_fieldarea = sum(count_field)) %>%
ungroup()
bestfield <- popareafield %>%
select(1:3) %>%
group_by(field) %>%
slice_max(count_fieldarea) %>%
arrange(SA4_CODE_2016)
bestfield %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(field,count_fieldarea, SA4_CODE_2016), y = count_fieldarea, fill = field)) +
labs(title = "Region and Best Field",
x = "Fields with region code",
y = "Number of Students") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
### Occupation: Region
```{r popareaocc, fig.cap=""}
popoccarea <- g57 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occarea = sum(count_occupation)) %>%
ungroup()
popareaoccmax <- popoccarea %>%
select(1:3) %>%
group_by(occupation) %>%
slice_max(count_occarea) %>%
arrange(SA4_CODE_2016)
popareaoccmax %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(occupation,count_occarea, SA4_CODE_2016), y = count_occarea, fill = occupation)) +
labs(title = "Region and Best Occupation", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
(G52 Analysis) {data-navmenu="G52"}
=============================
Row{data-width=420}
-----------------------------------------------------------------------
### Chart A
- Both figures show that, overall, females worked more than males. However, as the number of working hours increased, males worked more than females.
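A comparison like this can be read off a cross-tabulation of counts by hours band and sex. The sketch below uses hypothetical rows in the shape of `g52`; the hours bands and counts are invented, and `female_share` is an illustrative name.

```r
# Hypothetical rows: sex, lower bound of the hours band, and a count.
toy <- data.frame(
  sex            = c("F", "M", "F", "M", "F", "M"),
  hr_min         = c(1, 1, 16, 16, 40, 40),
  count_industry = c(500, 300, 450, 350, 200, 600)
)

# Total count for every combination of hours band and sex.
hours_by_sex <- xtabs(count_industry ~ hr_min + sex, data = toy)

# Share of females within each hours band.
female_share <- hours_by_sex[, "F"] / rowSums(hours_by_sex)

hours_by_sex
female_share
```

With these made-up numbers the female share falls as the hours band rises, which is the kind of pattern the bullet above describes.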
```{r, include=FALSE}
p1 <- g52 %>%
mutate(hr_min = as.numeric(hr_min)) %>%
summarise(hr_min = sum(hr_min, na.rm = TRUE))
p2 <- g52 %>%
mutate(hr_max = as.numeric(hr_max)) %>%
summarise(hr_max = sum(hr_max, na.rm = TRUE))
```
```{r hr_plots, fig.show='hold', out.width="50%"}
p1 <- g52 %>%
ggplot(mapping = aes(x = hr_min,
y = count_industry,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Minimum Hours") +
ylab("Count") +
ggtitle("Min hours worked for Industries")
p1
p2 <- g52 %>%
ggplot(mapping = aes(x = hr_max,
y = count_industry,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Maximum Hours") +
ylab("Count") +
ggtitle("Max hours worked for Industries")
p2
```
Row{data-height=200}
-----------------------------------------------------------------------
### Chart B
- The figure shows that industries such as health care, education and training, construction, and professional and technical services have a larger working population as working hours increase. Mining and electricity, gas and water show a small working population irrespective of working hours.
```{r ind_hrs}
g52redundanthrs <- g52[rep(rownames(g52), g52$count_industry), ]
hrindcount <- g52redundanthrs %>%
ggplot(mapping = aes(x = hr_min, y = industry)) +
geom_count() +
labs(title = "Population: Industries and hours", x = "Hours") +
theme(axis.title.y = element_blank())
ggplotly(hrindcount)
```
(G58 Analysis) {data-navmenu="G52"}
=============================
Row{data-width=400}
-----------------------------------------------------------------------
### Chart C
- The figure shows that, overall, females worked more than males across all occupations. For maximum hours worked, however, the numbers of men and women stayed the same as working hours increased.
```{r, include=FALSE}
p3 <- g58 %>%
mutate(hrs_min = as.numeric(hrs_min)) %>%
summarise(hrs_min = sum(hrs_min, na.rm = TRUE))
p4 <- g58 %>%
mutate(hrs_max = as.numeric(hrs_max)) %>%
summarise(hrs_max = sum(hrs_max, na.rm = TRUE))
```
```{r hrs_plots, fig.show='hold', out.width="50%"}
# Count by minimum hours and sex. g58 is piped in as the data,
# so it is not passed to ggplot() a second time.
p3 <- g58 %>%
ggplot(mapping = aes(x = hrs_min,
y = count_occupation,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Minimum Hours") +
ylab("Count") +
ggtitle("Min hours worked at Occupation")
p3
# Same plot for the upper bound of each hours band.
p4 <- g58 %>%
ggplot(mapping = aes(x = hrs_max,
y = count_occupation,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Maximum Hours") +
ylab("Count") +
ggtitle("Max hours worked at Occupation")
p4
```
Row{data-height=250}
-----------------------------------------------------------------------
### Chart D
- The figure shows that most employees in the SA4 regions work as Professionals, Managers, or Technicians and Trades Workers. Professionals in region 206 accounted for the highest number of employees, while Machinery Operators and Drivers in region 213 accounted for the lowest.
```{r popareaoccupation, fig.cap=""}
popareaoccupation <- g58 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occupationarea = sum(count_occupation)) %>%
ungroup()
popareaoccupationmax <- popareaoccupation %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_occupationarea) %>%
arrange(SA4_CODE_2016)
popareaoccupationmax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=occupation)) +
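# colour is a fixed setting, so it goes outside aes(); inside aes() it would be mapped as data
geom_sf_text(aes(geometry = geometry, label = occupation), colour = "white", check_overlap = TRUE) +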
theme_void() +
theme(legend.position = "bottom")
bestfield <- popareaoccupation %>%
select(1:3) %>%
group_by(occupation) %>%
slice_max(count_occupationarea) %>%
arrange(SA4_CODE_2016)
bestfield %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(occupation,count_occupationarea, SA4_CODE_2016), y = count_occupationarea, fill = occupation)) +
labs(title = "Region and Best Occupation", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
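The per-region maximum above relies on `group_by()` followed by `slice_max()`. One detail worth keeping in mind is that `slice_max()` keeps ties by default, so a region with two equally large occupations returns two rows, which after the join would draw that region twice. The sketch below illustrates this on a hypothetical `toy` data frame (values invented, column names taken from `popareaoccupation`) and shows `with_ties = FALSE` as one way to force a single row per region.

```r
library(dplyr)

# Hypothetical toy data; region 213 has a deliberate tie.
toy <- data.frame(
  SA4_CODE_2016 = c("206", "206", "206", "213", "213"),
  occupation = c("Professionals", "Managers", "Labourers",
                 "Professionals", "Managers"),
  count_occupationarea = c(900, 400, 100, 300, 300),
  stringsAsFactors = FALSE
)

# Default behaviour: tied maxima are all kept.
top_with_ties <- toy %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_occupationarea) %>%
  ungroup()

# Exactly one row per region, dropping ties.
top_single <- toy %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_occupationarea, n = 1, with_ties = FALSE) %>%
  ungroup()

print(nrow(top_with_ties))
print(nrow(top_single))
```

Whether ties should be kept is a judgment call for the dashboard; the sketch only shows what each option returns.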
Maps {data-orientation=rows data-navmenu="Regions"}
=========================================
Column
-----------------------------------------
### Education Level: Region
```{r edmap, fig.cap="Spatial Education Level Distribution"}
popareaedu <- g46a %>%
group_by(SA4_CODE_2016, afq_level) %>%
summarise(count_eduarea = sum(count_edu_lvl)) %>%
ungroup()
popareaedumax <- popareaedu %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_eduarea) %>%
arrange(SA4_CODE_2016)
popareaedumax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=afq_level)) +
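# colour is a fixed setting, so it goes outside aes(); inside aes() it would be mapped as data
geom_sf_text(aes(geometry = geometry, label = afq_level), colour = "white", check_overlap = TRUE) +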
theme_void() +
scale_fill_brewer() +
theme(legend.position = "none")
```
### Industry: Region
```{r indmap, fig.cap="Spatial Industry Distribution"}
popindarea <- g51 %>%
group_by(SA4_CODE_2016, industry) %>%
summarise(count_indarea = sum(count_industry)) %>%
ungroup()
popindareamax <- popindarea %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_indarea)
popindareamax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=industry)) +
# geom_sf_text(aes(geometry= geometry,label=industry, colour="white"), check_overlap=TRUE)+
theme_void() +
scale_fill_brewer() +
theme(legend.position = "none")
# Major industry in the CBD is healthcare
# Major industry in the countryside is agriculture
```
Column
-----------------------------------------
### Field: Region
```{r fieldmap, fig.cap="Spatial Study Field Distribution"}
popareafield <- g47 %>%
group_by(SA4_CODE_2016, field) %>%
summarise(count_fieldarea = sum(count_field)) %>%
ungroup()
popareafieldmax <- popareafield %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_fieldarea) %>%
arrange(SA4_CODE_2016)
popareafieldmax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=field)) +
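# colour is a fixed setting, so it goes outside aes(); inside aes() it would be mapped as data
geom_sf_text(aes(geometry = geometry, label = field), colour = "white", check_overlap = TRUE) +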
theme_void() +
scale_fill_brewer() +
theme(legend.position = "none")
```
### Occupation: Region
```{r occmap, fig.cap="Spatial Occupation Distribution"}
popareaocc <- g57 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occarea = sum(count_occupation)) %>%
ungroup()
popoccareamax <- popareaocc %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_occarea)
popoccareamax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=occupation)) +
# geom_sf_text(aes(geometry = geometry, label = occupation), colour = "white", check_overlap = TRUE) +
theme_void() +
scale_fill_brewer() +
theme(legend.position = "bottom")
```